into a set of RNA secondary structures which are well conserved, and reverse genetic studies indicate 
that these structures play an important role in the discontinuous synthesis of subgenomic RNAs in the 
betacoronaviruses. These cis-acting elements extend 3’ of the 5’UTR into ORF1a. The 3’UTR is similarly 
conserved and contains all of the cis-acting sequences necessary for viral replication. Two competing 
conformations near the 5’ end of the 3’UTR have been shown to make up a potential molecular switch. 
There is some evidence that an association between the 3’ and 5’UTRs is necessary for subgenomic RNA 
synthesis, but the basis for this association is not yet clear. A number of host RNA-binding proteins have been 
shown to bind to the 5’ and 3’ cis-acting regions, but the significance of these in viral replication is not 
clear. Two viral proteins have been identified as binding to the 5’ cis-acting region, nsp1 and N protein. 
A genetic interaction between nsp8 and nsp9 and the region of the 3’UTR that contains the putative 
molecular switch suggests that these two proteins bind to this region. 
1. Introduction 


Coronaviruses (CoVs) are an important cause of illness in 
humans and animals. Most human coronaviruses commonly cause 
relatively mild respiratory illnesses; however two zoonotic coro- 
naviruses, SARS-CoV and MERS-CoV, can cause severe illness and 
death. Investigations over the past 35 years have illuminated many 
aspects of coronavirus replication. The focus of this review is the 
structural and functional analyses of conserved RNA secondary 
structures in the 5’ and 3’ regions of betacoronavirus genomes. 


1.1. Classification and pathogenicity of coronaviruses 


Coronaviruses belong to the subfamily Coronavirinae (http:// 
ictvonline.org/virusTaxonomy.asp?version=2012), which together 
with Torovirinae make up the Coronaviridae family in the order 
Nidovirales. The Coronavirinae are classified into four genera 
Alphacoronavirus, Betacoronavirus, Deltacoronavirus and Gam- 
macoronavirus. Torovirinae includes two genera, Torovirus and 
Bafinivirus. The Coronaviridae comprises a group of evolutionarily
related, single-stranded, positive-sense, non-segmented, 
enveloped RNA viruses of vertebrates. The RNA genomes are 
25-31 kb, the largest genomes of all known RNA viruses, and are 
infectious when introduced into permissive cells. However, unlike 
those of almost all other positive-strand RNA viruses, the RNA 
infectivity of transfected coronavirus genomes is greatly increased 
in the presence of a source of N protein (Casais et al., 2001; 
Grossoehme et al., 2009; Yount et al., 2000). Alphacoronaviruses 
include alphacoronavirus 1 (transmissible gastroenteritis virus, 
TGEV), porcine epidemic diarrhea virus (PEDV), bat coronavirus 
1, BtCoV 512, BtCoV-HKU8, BtCoV-HKU2, human coronavi- 
rus HCoV-NL63 and HCoV-229E. Gammacoronaviruses include 
avian coronavirus and whale coronavirus SW1. Deltacoronaviruses 
include coronavirus HKU11, HKU12, and HKU13. The major emphasis
of this review is on the betacoronaviruses, which have been 
the most studied genus. Within the genus Betacoronavirus, four 
lineages (A, B, C, and D) each with a unique set of accessory genes 
are commonly recognized. Lineage A includes HCoV-OC43 and 
HCoV-HKU1, betacoronavirus 1 (more commonly known as bovine 
coronavirus, BCoV), murine coronavirus (MHV); Lineage B includes 
severe acute respiratory syndrome-related SARS-CoV and various 
species recovered from bats; Lineage C includes Tylonycteris 
bat coronavirus HKU4 (BtCoV-HKU4), Pipistrellus bat coronavirus 
HKU5 (BtCoV-HKU5). Since April 2012, the Middle East Respiratory 
Syndrome MERS-CoV has emerged as a new member in lineage C of 
the betacoronaviruses, closely related to bat coronaviruses HKU4 
and HKU5 (de Groot et al., 2013; Drexler et al., 2014; Zaki et al., 
2012). MERS-CoV is the first Betacoronavirus lineage C member 
isolated from humans. Lineage D includes Rousettus bat coronavirus
HKU9 (BtCoV-HKU9), which has only been detected in bats 
(http://www.ecdc.europa.eu/en/publications/Publications/novel- 
coronavirus-rapid-risk-assessment-update.pdf). 

Coronaviruses (CoVs) cause respiratory, enteric, hepatic and 
neurological diseases in a broad range of vertebrate species (Stadler 
et al., 2003; Weiss and Leibowitz, 2007). Most human coronaviruses 
commonly cause relatively mild respiratory disease; however, two 
coronaviruses, SARS-CoV (Rota et al., 2003) and MERS-CoV (Zaki 
et al., 2012), can cause severe illness and death. SARS-CoV was 
first recognized in China in November 2002 causing a worldwide 
outbreak including 774 deaths from 2002 to 2003. MERS-CoV is 
a novel coronavirus first reported in Saudi Arabia in 2012 and 
has caused illness in hundreds of people from several countries 
(http://www.cdc.gov/coronavirus/about/index.html). As of July 23, 
2014, 837 laboratory-confirmed cases of MERS-CoV infection have 
been reported by WHO, including 291 deaths. Both SARS-CoV and 
MERS-CoV are thought to have originated in bats and spread to 
humans through intermediate hosts (Coleman and Frieman, 2014). 
Human coronaviruses also have been detected in human CNS and 
are able to replicate in CNS derived cells (Arbour et al., 1999; Murray 
et al., 1992) as well as having been isolated from patients with gas- 
troenteritis and diarrhea (Gerna et al., 1985; Resta et al., 1985), and 
more seriously, causing neonatal necrotizing enterocolitis (Rousset 
et al., 1984). 


2. Genome organization and replication 
2.1. Genome organization 


Coronaviruses are roughly spherical with a fringe of large, bulb- 
ous surface projections. Coronaviruses infect cells primarily by 
binding of the spike protein to its host specific cell receptors 
(Delmas et al., 1992; Hofmann et al., 2005; Li et al., 2003; Raj 
et al., 2013; Williams et al., 1991; Yeager et al., 1992). Virus enters 
cells by fusion at the cell surface or by an endocytotic pathway 
depending upon the strain of virus and the target cell (Nash and 
Buchmeier, 1997; Wang et al., 2008). After entering the cytoplasm 
and uncoating, the virus particle releases the RNA genome. For all 
coronaviruses the genomes are organized into 5’ non-structural 
protein coding regions comprising the replicase genes, which are 
two-thirds of the genome, and 3’ structural and nonessential acces- 
sory protein coding regions (Masters, 2006). Infected cells contain 
seven to nine virus specific mRNAs with coterminal 3’ ends, the 
largest of which is the genomic RNA (Masters, 2006). All of the 
mRNAs carry identical 70-90 nts leader sequences at their 5’ ends 
(Lai et al., 1983, 1984; Leibowitz et al., 1981; Spaan et al., 1982). 
The 3’ end of the leader sequence contains the transcriptional reg- 
ulatory sequence (TRS-L), which is also present in the genome 
just upstream of the coding sequence for each transcription unit 
[TRS-B (body)], where it acts as a cis-regulator of transcription 
(Budzilowicz et al., 1985). All coronavirus TRSs include a conserved 
6-8 nucleotide core sequence (CS) plus variable 5’ and 3’ flanking 
sequences (Sola et al., 2005). Betacoronaviruses contain a consen- 
sus heptameric sequence, 5’-UCUAAAC-3’, with the SARS-CoV TRS 
having 5’-ACGAAC-3’ as the core sequence (Marra et al., 2003; Rota 
et al., 2003). Replication occurs shortly after entry and uncoat- 
ing of the virion through production of full-length genomic and 
subgenomic negative strand intermediates (Baric and Yount, 2000; 
Sawicki and Sawicki, 1990; Sethna et al., 1989). Translation of 
subgenomic mRNAs gives rise to structural and nonstructural viral 
proteins. The replicated RNA genome is then encapsidated and 
packaged into virions. A minimal 69-nts packaging signal has been 
characterized in MHV that maps within ORF1b approximately 20 kb 
from the 5’ end of the genome, which is sufficient for RNA to be 
incorporated into virions (Fosmire et al., 1992; Kuo and Masters, 
2013; Makino et al., 1990; Narayanan et al., 2003; van der Most 
et al., 1991). The BCoV packaging signal exhibits 74% sequence 
identity to the MHV packaging signal and is located in a similar 
position (Cologna and Hogue, 2000). The TGEV packaging signal 
was originally mapped to the first 649 nts at the 5’ end of the 
genome; subsequently this position was further delimited to the 
first 598 nts (Escors et al., 2003; Morales et al., 2013). Viruses bud 
into smooth walled vesicles in the endoplasmic reticulum-Golgi 
intermediate compartment (ERGIC). After budding, virus particles 
mature in the Golgi, with a compact, electron-dense internal core. 
Viruses traverse the Golgi and are transported in exocytic vesicles 
which eventually fuse with the plasma membrane to release virus 
into the extracellular space (Holmes and Lai, 1996). 
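
The TRS organization described above (a core sequence at the 3’ end of the leader, repeated upstream of each transcription unit) can be illustrated with a short sketch. The Python below simply scans an RNA string for exact copies of a core sequence such as the betacoronavirus consensus 5’-UCUAAAC-3’. The function name and the toy sequence are hypothetical and are not taken from any viral genome; real TRS annotation would also weigh the variable flanking sequences, which this sketch ignores.

```python
def find_trs_sites(rna, core="UCUAAAC"):
    """Return 0-based start positions of every exact match of `core` in `rna`."""
    # Accept DNA-style input by normalizing case and converting T to U.
    rna = rna.upper().replace("T", "U")
    core = core.upper().replace("T", "U")
    sites = []
    start = rna.find(core)
    while start != -1:
        sites.append(start)
        # Advance by one nucleotide so overlapping matches are not missed.
        start = rna.find(core, start + 1)
    return sites


# Toy sequence (an assumption for illustration only): one leader-like copy of
# the core near the 5' end and one body-like copy further downstream.
toy_genome = "GGAUCUAAACGGGGCCCCUCUAAACAUGAAAUAA"
print(find_trs_sites(toy_genome))
```

For the toy sequence the function reports positions 3 and 18, the first standing in for TRS-L and the second for a TRS-B. Passing `core="ACGAAC"` would search for the SARS-CoV core sequence instead.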


2.2. Genomic and subgenomic mRNAs and their encoded proteins 


Coronavirus messenger RNA 1, which is genome length and 
contains two overlapping reading frames, ORF 1a and ORF 1b, 
directs the synthesis of two precursor polyproteins, pp1a and 
pp1ab, via a -1 frameshifting mechanism involving a pseudoknot
structure (Bredenbeek et al., 1990). The polyproteins are then 
processed by two or three virus-encoded (in ORF1a) proteinase 
domains to produce a membrane-bound replicase-transcriptase 
complex (Brockway et al., 2003). Upon proteolytic processing, the 
frameshifted ORF 1ab polypeptide generates 15-16 nonstructural 
proteins, many of which are involved in either RNA synthesis or 
proteolytic processing required for viral replication: nsp1-nsp11 
encoded in ORF 1a and nsp12-16 encoded in ORF1b (Ziebuhr et al., 
2000). ORF 1a encodes two or three protease domains: one or two 
papain-like domains in nsp3, depending on the particular coronavirus, and 
one picornavirus 3C-like domain in nsp5 (Schiller et al., 1998; Weiss 
et al., 1994). Nsp8 in ORF 1a contains a second RNA-dependent RNA 
polymerase (RdRp) domain that is proposed to function as a pri- 
mase and produce primers utilized by the primer-dependent nsp12 
RdRp (Imbert et al., 2006). ORF 1b encodes an RNA-dependent RNA 
polymerase core unit in nsp12 (although a variable number of nts 
of the nsp12 coding sequence lies within ORF 1a, depending on 
the particular virus), a superfamily 1 helicase in nsp13, an exonuclease
and N7-methyltransferase in nsp14, an endoribonuclease 
in nsp15, and an S-adenosylmethionine-dependent 2’-O-methyl 
transferase in nsp16 (Bhardwaj et al., 2004; Chen et al., 2009; 
Cheng et al., 2005; Ivanov and Ziebuhr, 2004; Lee et al., 1991; 
Pinon et al., 1999; Putics et al., 2005; Snijder et al., 2003). The 
enzymatic activities of the exonuclease, endoribonuclease and S- 
adenosylmethionine-dependent 2’-O-methyl transferase encoded 
by nsp14, 15 and 16 are unique to Nidoviruses. Subgenomic mRNAs 
encode the major viral structural proteins, including spike protein 
(S), envelope protein (E), membrane protein (M), nucleocapsid pro- 
tein (N), and the accessory proteins. Spike protein binds to the 
specific receptor on host cell plasma membranes. S is a class I 
fusion protein inducing cell fusion. In some betacoronaviruses and 
all gammacoronaviruses, the precursor polypeptide is cleaved by 
a cellular protease into noncovalently associated amino-terminal 
S1 and carboxy-terminal S2 subunits. The receptor binding domain 
(RBD) of MHV S1 determines receptor specificity, and S2 contains 
the transmembrane domain and two heptad repeat regions (HR1 
and HR2) required for fusion activity (McRoy and Baric, 2008). On 
the other hand, the spike protein is uncleaved in most alphacoro- 
naviruses and the betacoronavirus SARS-CoV. Both E and M are 
required for normal virion assembly in MHV (Fischer et al., 1998; 
Maeda et al., 2001), but E protein is not required for assembly for 
all coronaviruses (DeDiego et al., 2007; Kuo and Masters, 2003). 
Nucleocapsid protein binds to the viral genomic RNA to form the 
ribonucleoprotein complex and also displays an RNA chaperone 
activity in vitro (Baric et al., 1988; Grossoehme et al., 2009; Masters, 
2006; Nelson et al., 2000; Stohlman et al., 1988; Zuniga et al., 2007). 


This RNA chaperone activity has been proposed to have an important
role in genome replication and sgRNA transcription (Zuniga 
et al., 2007). N proteins contain two structurally independent RNA-
binding domains, the N-terminal RNA binding domain (NTD) and 
a C-terminal domain (CTD, residues 256-385) which also has RNA 
binding activity, joined by a charged linker region rich in serine and 
arginine residues (SR linker) (Chang et al., 2009; Grossoehme et al., 
2009). The NTD makes a specific and high affinity complex with 
the TRS or its complement (cTRS) and fully unwinds a TRS-cTRS 
duplex that plays a critical role in subgenomic RNA synthesis and 
other processes requiring RNA remodeling (Cologna et al., 2000; 
Grossoehme et al., 2009; Hurst et al., 2009; Zuniga et al., 2010). 
The N3 domain (residues 409-454) which extends to the true C- 
terminus of the N protein plays a role in determining N-membrane 
protein interaction in MHV (Hurst et al., 2005). 
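
The -1 ribosomal frameshift that joins ORF 1a to ORF 1b, mentioned at the start of this section, changes the reading frame by moving the ribosome back one nucleotide. A minimal sketch of that bookkeeping follows. The sequence and the slip position are made up for illustration, the helper names are hypothetical, and the code models neither the slippery sequence nor the pseudoknot that promote the shift.

```python
def split_codons(rna, start=0):
    """Split `rna` into complete codons, reading from index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


def read_with_minus1_shift(rna, slip_index):
    """Read codons in frame 0 up to `slip_index`, then continue in the -1 frame.

    `slip_index` is the 0-based index where the next frame-0 codon would have
    started; after the slip, reading resumes one nucleotide earlier.
    """
    if slip_index < 3 or slip_index % 3 != 0:
        raise ValueError("slip_index must be a positive multiple of 3")
    upstream = split_codons(rna[:slip_index])
    downstream = split_codons(rna, start=slip_index - 1)
    return upstream + downstream


# Toy sequence (an assumption, not a coronavirus frameshift site).
toy_rna = "AUGUUUAAACGGGCC"
print(split_codons(toy_rna))               # frame 0 throughout
print(read_with_minus1_shift(toy_rna, 9))  # frame 0, then -1 frame
```

Without the shift the toy sequence reads AUG UUU AAA CGG GCC; with the shift at index 9 the last nucleotide of AAA is read twice and the downstream codons become ACG GGC, which is how a single slip exposes a different downstream reading frame.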


2.3. Viral RNA synthesis 


Viral RNA synthesis occurs in the cytoplasm on double-walled 
membrane vesicles (Gosert et al., 2002; Knoops et al., 2008). During 
MHV RNA replication and transcription of subgenomic RNAs, the 
genomic RNA serves as a template for the synthesis of full-length 
and subgenomic negative-strand RNAs, the latter through a dis- 
continuous transcription mechanism (Sawicki and Sawicki, 1990, 
1998; Sola et al., 2005; van Marle et al., 1999; Zuniga et al., 2004). 
In turn, full-length negative-strand RNAs serve as templates for the 
synthesis of genome RNA and negative strand subgenomic RNAs 
serve as the templates for subgenomic mRNA synthesis. In this dis- 
continuous transcription model, negative-strand subgenomic RNAs 
are transcribed from a genome-length template and leader-body 
joining is accomplished during the synthesis of negative-strand 
subgenomic RNAs through a copy-choice like mechanism involv- 
ing TRS-B and TRS-L sequences (Pasternak et al., 2003; Sawicki and 
Sawicki, 1990, 1998; van Marle et al., 1999; Zuniga et al., 2004). 
In an elaboration of this model, viral and/or cellular factors bind- 
ing to cis-acting RNA elements in the genomic RNA 5’ untranslated 
region (5’UTR) and 3’UTR, might circularize the genome, promoting 
template switching by topologically enabling base pairing between 
TRS-L and the nascent complementary TRS-Bs by the viral tran- 
scriptase/replicase complex (TRC) (Sola et al., 2005; Zuniga et al., 
2004). Recently, Mateos-Gomez et al. (2013) characterized a long-distance
RNA-RNA interaction within the genomic RNA of TGEV 
that serves as a transcriptional enhancer by bringing the TRS-L and 
the TRS-B controlling N gene transcription into close proximity, raising 
the possibility that similar long-range RNA-RNA interactions
might be present in other abundantly transcribed subgenomic 
RNAs. 
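
The outcome of the discontinuous transcription model can be sketched at the sequence level: each subgenomic mRNA carries the 5’ leader through the TRS, followed by the genome sequence downstream of one body TRS, so that all products share the same 5’ leader and the same 3’ end. The Python below is a deliberately simplified illustration. It joins the strings directly on the positive strand, whereas the model places the template switch during negative-strand synthesis; the function name, the toy genome and the exact-match rule are all assumptions.

```python
def subgenomic_mrna(genome, trs_core, body_start):
    """Join the leader (through TRS-L) to the sequence 3' of the TRS-B at `body_start`."""
    leader_start = genome.find(trs_core)  # the first copy is treated as TRS-L
    if leader_start == -1:
        raise ValueError("no TRS core sequence found in genome")
    core_len = len(trs_core)
    is_body_copy = genome[body_start:body_start + core_len] == trs_core
    if body_start <= leader_start or not is_body_copy:
        raise ValueError("body_start does not point at a downstream TRS copy")
    leader = genome[:leader_start + core_len]  # leader including TRS-L
    body = genome[body_start + core_len:]      # everything 3' of TRS-B
    return leader + body


# Toy genome (an assumption): TRS-L at index 3, a single TRS-B at index 18.
toy_genome = "GGAUCUAAACGGGGCCCCUCUAAACAUGAAAUAA"
sg_mrna = subgenomic_mrna(toy_genome, "UCUAAAC", 18)
print(sg_mrna)
```

In this toy case the product begins with the same leader as the genome and ends with the same 3’ sequence, while the region between TRS-L and TRS-B is absent.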


3. 5’-cis-Acting RNA elements in coronavirus replication 
and transcription 


The coronavirus RNA 5’UTR, the adjacent sequences encoding
the amino-terminus of nonstructural protein 1 (nsp1) and the 
3’UTR contain cis-acting sequences that fold into secondary and 
higher-order structures, which contribute to their stability and 
to their involvement in inter- and intra-molecular interactions. 
Those structures are functionally important for RNA-RNA inter- 
actions and for the binding of viral and cellular proteins during 
RNA replication and translation (Brian and Baric, 2005; Liu and 
Leibowitz, 2010; Liu et al., 2009b) but cannot be complemented in 
trans. Many cis-acting sequences and their functional roles in viral 
transcription and replication were initially defined and studied in 
defective interfering (DI) RNAs (Chang et al., 1994; Dalton et al., 
2001; Kim and Makino, 1995; Liu et al., 2001; Makino et al., 1985, 
1988; Raman et al., 2003; Raman and Brian, 2005). These DI RNAs 
are largely deleted but contain the cis-acting sequences necessary 
for their replication, including the 5’UTR and 3’UTR. DI RNAs repli- 
cate by using the RNA synthesis machinery of helper virus which 
provides replicase components in trans, and often interfere with 
viral genomic RNA replication. Several studies showed that approx- 
imately 400 nts to 800 nts at the 5’ end and 400 nts at the 3’ end of 
MHV RNA genome are necessary for DI RNA replication (Kim et al., 
1993; Lin and Lai, 1993; Luytjes et al., 1996). The minimal length of 
5’ sequence that supported MHV DI replication is 467 nts (Luytjes 
et al., 1996); the 3’UTR contains all the 3’ cis-acting sequences nec- 
essary for this process (de Haan et al., 2002; Goebel et al., 2004a). 
In BCoV the 5’ 498 nts act as a cis-acting signal for DI RNA repli- 
cation (Chang et al., 1994). The development of reverse genetic 
systems for a number of coronaviruses has allowed the study of 
cis-acting sequences and their functions in the context of the whole 
viral genome. 

MHV, BCoV and SARS-CoV are closely related in the Betacoro- 
navirus genus (Gorbalenya et al., 2006). The secondary structures 
in the 5’-end-proximal genomic regions of these three viruses are 
largely conserved even though the nucleotide sequences are rela- 
tively divergent (Chen and Olsthoorn, 2010; Guan et al., 2012). A 
series of studies by consensus covariation modeling, chemical pro- 
bing, SHAPE technology, and nuclear magnetic resonance (NMR) 
spectroscopy in conjunction with reverse genetics have been car- 
ried out, in order to characterize the predicted secondary structures 
of cis-acting sequences in the 5’UTR and the N-terminal nsp1 cod- 
ing region of MHV and BCoV and to identify their functional roles 
in viral replication (Chen and Olsthoorn, 2010; Guan et al., 2011, 
2012; Li et al., 2008; Liu et al., 2007, 2009a; Yang et al., 2011, 2015). 


3.1. Secondary structure models 


The 5’-most 140 nts of the MHV genome have been predicted 
to contain three conserved stem-loops (SL) 1, 2 and 4 based on 
a consensus secondary structural model from nine representative 
coronaviruses using phylogenetic analysis, ViennaRNA, Mfold and 
PKNOTS (Kang et al., 2006a; Liu et al., 2007). The nine coronaviruses 
modeled include five Beta-CoVs: BCoV, human coronavirus HCoV-OC43,
HKU1, SARS-CoV, MHV-A59; three Alpha-CoVs: HCoV-NL63, 
HCoV-229E and TGEV; and one Gamma-CoV, IBV. The secondary 
structural models of the 5’ 140 nts of these coronaviruses are 
remarkably similar, and all contain three conserved helical stems, 
SL1, SL2, and SL4. SARS-CoV and BCoV are also predicted to contain 
an additional stem-loop, SL3, which folds the leader TRS (TRS-L) 
sequences into a hairpin loop. For MHV, a similar base pairing 
scheme for SL3 can be drawn (Chen and Olsthoorn, 2010), but the 
SL3 stem is not predicted to be stable at 37 °C (Liu et al., 2007). The 
secondary structures of the 5’UTR and adjacent coding sequences in 
all CoVs predicted by a structural-phylogenetic analysis (Chen and 
Olsthoorn, 2010) are largely consistent with the models by Kang 
et al. (2006a) and Liu et al. (2007). 

In a recent study (Yang et al., 2015), SHAPE technology was uti- 
lized to determine the secondary structures formed by the 5’ most 
474 nts cis-acting region required for replication of MHV-A59 DI 
RNA. The structure generated (Fig. 1A) was in excellent agreement 
with our previous characterization of SL1 (Li et al., 2008), SL2 (Liu 
et al., 2007, 2009a) and the correct structure for SL4 at position 
nts 80-130 (Yang et al., 2011). These stem-loops serve as cis-acting 
elements required for driving subgenomic RNA synthesis. Interest- 
ingly, the stem predicted for SL3 by phylogenetic algorithms (Chen 
and Olsthoorn, 2010) in the TRS region was single-stranded with 
relatively high SHAPE reactivity, consistent with the prediction of 
Liu et al. (2007) that this region was weakly paired or unpaired. The 
structure was also in good agreement with the two recent models 
of MHV-A59 RNA secondary structure (Guan et al., 2011, 2012) that 
identified two additional cis-acting replication elements required 


for optimal viral replication. S5 (Fig. 1), which contains a long-range 
RNA-RNA interaction (nts 141-167 base paired with nts 363-335) 
was equivalent to the base pairing (nts 141-170 paired with nts 
363-332) predicted by Guan et al. (2012), with the exception of 
nts 332-334, which are unpaired in the Guan model but are part 
of the SL5C stem in the SHAPE-informed model (Fig. 1A). SL5A (nts 
171-225) was identical to the stem-loop designated as SLIV previ- 
ously (Brown et al., 2007; Guan et al., 2011). Furthermore, SL5A is 
remarkably similar to a stem-loop, predicted and designated SL5 by 
Chen and Olsthoorn (2010) based upon a structural-phylogenetic 
analysis of the 5’UTRs of betacoronaviruses. SHAPE analysis has also 
provided biochemical support for SL5B, SL5C, SL6 and SL7, which, 
in MHV, previously lacked support from genetic or biochemical 
studies. 
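
Long-range helices such as S5 are defined by antiparallel pairing between two distant segments. As a rough illustration of how a proposed pairing can be checked against sequence, the sketch below scores the fraction of positions that can form Watson-Crick or G-U wobble pairs. The segments shown are invented, not MHV sequence, and the function assumes equal-length, ungapped segments, so it cannot represent the bulges implied by the unequal ranges quoted above (nts 141-167 versus nts 363-335).

```python
# Pairs allowed in this sketch: Watson-Crick plus G-U wobble.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def paired_fraction(five_prime, three_prime):
    """Fraction of positions that can pair when the two segments are antiparallel.

    Both segments are written 5'->3'; the 3' segment is reversed so that the
    first base of `five_prime` faces the last base of `three_prime`.
    """
    if len(five_prime) != len(three_prime) or not five_prime:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(
        (a, b) in ALLOWED_PAIRS
        for a, b in zip(five_prime, reversed(three_prime))
    )
    return matches / len(five_prime)


# Invented segments for illustration only.
print(paired_fraction("GGAU", "AUCC"))  # fully complementary
print(paired_fraction("GGAU", "AUCA"))  # one mismatch at the helix end
```

A thermodynamic folding program or SHAPE-constrained prediction, as used in the studies cited here, is still needed to decide whether such a helix actually forms; this check only asks whether the sequences permit it.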

Previously, the Brian lab defined four stem loops, SLI, II, III and 
IV, within the 210 nts 5’UTR of BCoV using the Mfold algorithm and 
their existence is supported by enzymatic probing and mutational 
analysis in DI RNA replication assays (Chang et al., 1994; Raman 
et al., 2003; Raman and Brian, 2005). It should be noted that the 
four stem loops defined by the Brian lab differ from the structures 
predicted by our group, except that SLIII in the Brian model is almost 
identical with the predicted SL4b in MHV (Kang et al., 2006a). 

Fig. 1 shows a comparison of the secondary structure of 
MHV-A59 informed by SHAPE analysis (Fig. 1A) with the most ther- 
modynamically stable models of the 5’ most 474 nts of BCoV-Mebus 
(Fig. 1B), SARS-CoV (Fig. 1D), and MERS-CoV (Fig. 1C), generated by 
RNAstructure software without incorporating any SHAPE reactiv- 
ity data. In general, the overall configuration of the three models 
is similar despite the relatively high divergence of the nucleotide 
sequence. A core feature is a conserved SL5ABC four-helix junction, 
a finding strongly consistent with a conserved core architecture 
(Laing and Schlick, 2009). The initiation codon for nsp1 is in a simi- 
lar position in SL5A in the two lineage A beta-CoVs, MHV and BCoV, 
whereas the initiation codon for SARS-CoV (lineage B) nsp1 and 
MERS-CoV (lineage C) are in more 3’ positions, in the first part of 
the 3’ side of S5 for SARS-CoV and in SL5B for MERS-CoV, as shown in 
Fig. 1. All four models contain the long-range base-pairing between 
5’UTR sequences and the nsp1 coding region that make up the S5 
stem. The lengths of connecting single-stranded junctions between 
the helices are also generally similar or identical, although the sin- 
gle stranded junctions between SL5A and the other two helices (S5 
and SL5B, see Fig. 1) are shorter for BCoV than they are in MHV 
or SARS-CoV. The majority of the RNAstructure-based model for 
BCoV-Mebus (Fig. 1B) is in good agreement with the model for 
the 5’UTR and N-terminal nsp1 coding region of BCoV generated 
by Mfold modeling, phylogenetic covariation, and biochemical and 
genetic studies (Brown et al., 2007; Chen and Olsthoorn, 2010; Guan 
et al., 2011, 2012; Kang et al., 2006a; Liu et al., 2007; Raman et al., 
2003; Raman and Brian, 2005), although the nomenclature is not 
equivalent. Parts of the RNAstructure-based model for SARS-CoV 
(Fig. 1D) are also similar to the model for the 5’ proximal sequence 
of SARS-CoV developed through structural-phylogenetic analysis, 
especially the presence of the substructures, SL5A, 5B and 5C which 
have been proposed to function in genome packaging (Chen and 
Olsthoorn, 2010). Following the S5 helical stem, the structures gen- 
erated by RNAstructure are more divergent among the four viruses 
with MHV-A59 and BCoV-Mebus being relatively similar, although 
SL6 and SL7 are predicted to be in a forked stem-loop structure 
for BCoV whereas for MHV they form separate stem loops, and 
for SARS-CoV, three stem-loops, SL6, SL7 and SL8, are predicted. 
In MERS-CoV this region is predicted to fold into two bulged stem-loop
structures, although their structure and position differ from 
those of MHV. 

SHAPE analysis for in vitro transcribed and refolded RNA and 
ex virio genomic RNA in the 5’ cis-acting region generated iden- 
tical secondary structures (Yang et al., 2015) and was generally 

Fig. 1. Comparison of secondary structure models of the 5’ regions of MHV-A59, BCoV, MERS-CoV and SARS-CoV. (A) The MHV-A59 model was generated by SHAPE analysis. 
The thermodynamically most stable models of the corresponding regions of (B) BCoV-Mebus, (C) MERS-CoV (Al-Hasan 3), and (D) SARS-CoV were generated by RNAstructure. 
The gray and italicized text denotes the core leader TRS regions. The gray AUGs represent the start codons of the short open reading frames (heavy black lines) that are 
upstream from the nsp1 initiating AUG codons (pink shading) in all four viruses. The nsp1 open reading frames are indicated by the red lines. Note that the nomenclature for 
BCoV is not equivalent to that of previous studies by the Brian Laboratory (see text); SL1 and SL2 correspond to SLI, SL3 to SLII, SL4 to SLIII, SL5A to SLIV, SL5B to SLV, SL5C 
to SLVI, S5 to a long-range RNA-RNA interaction, and SL6 to SLVIII. 


consistent with previous studies of RNA secondary structure in 
this region. This gives us confidence that this methodology will 
be useful to explore the fine and long range secondary structures 
in the context of genomic RNA from authentic viral particles, 
particularly to determine the interactions of the intergenic regions 
and their flanking sequences which are likely to play a role in 
regulating template switching in MHV sgRNA synthesis. 

The four betacoronavirus RNAstructure-based models described 
above share the conserved SL1, SL2, SL4 and SL5ABC secondary 
structure elements with the sequence covariation-based models 
of the CoV 5’UTR (Chen and Olsthoorn, 2010; Kang et al., 2006a; 
Liu et al., 2007). This conservation of SL1, SL2, SL4, and SL5ABC 
among the betacoronaviruses and SL1, SL2, and SL4 among alpha-, 
beta-, and gammacoronaviruses suggests that these secondary 
structures serve as common cis-acting signals important in CoV 
replication and viral RNA synthesis. Functional analyses of individ- 
ual structural elements described below suggest that the structural 
features of SL1, SL2 and SL4 are more important than the precise 
nucleotide sequences in these stem-loops. However it should be 
noted that in BCoV and SARS-CoV SL3 overlaps TRS-L, which defines 
the leader-body junction region for discontinuous sgRNA synthe- 
sis, while SHAPE analysis indicates that the corresponding region is 
single-stranded in MHV, consistent with the prediction of Liu et al. 
(2007). Thus for this region primary sequence is the functionally 
important feature. 


3.2. Functional studies of individual structural elements 


Although the RNA secondary structure of the 5’UTR of coronaviruses
appears to be conserved across at least three genera, 
functional analyses of the individual stem-loop structures have 
only been performed on two betacoronaviruses, MHV and BCoV. 
The individual stem-loop structures contained in the 5’UTRs of 
related betacoronaviruses are largely interchangeable, with the 
exception of their TRSs (Kang et al., 2006a). The HCoV-OC43 SL1 
can functionally replace the MHV SL1 (Li et al., 2008). SARS-CoV 
SL1, SL2, and SL4 can separately replace their corresponding MHV 
counterparts in the MHV genome; however, the MHV chimera is 
not viable when the entire MHV 5’UTR is replaced by the SARS- 
CoV 5'UTR even when the MHV leader TRS is substituted for the 
SARS TRS (Kang et al., 2006a). Replacement of each of four 5’ end 
cis-acting stem-loops, SL1, SL2, SL4 and SL5A in the MHV genome 
with their counterparts in the BCoV genome yields chimeric viruses 
with near-wild-type MHV phenotypes. However, the 32 nts (nts 
142-173, see Fig. 1) which make up the 5’ side of S5 in the BCoV 
5’UTR and which were predicted to be single-stranded in an earlier
thermodynamic and phylogenetic model (Guan et al., 2011), 
cannot directly replace the equivalent sequences in MHV without 
genetic adaptation (Guan et al., 2011, 2012). These adaptations suggested
that a long-range interaction between nts 141-170 and nts 
332-363 (S5, Fig. 1A) is required for optimal viral replication (Guan 
et al., 2012). 

In addition, Fig. 1 shows the near identity of the location of 
the initiating AUG for MHV-A59 and BCoV-Mebus in SL5A, and 
the more 3’ location for the SARS-CoV initiation codon in the 3’ 
side of S5. An attempt to create an MHV chimeric virus in which 
the entire MHV 5’UTR was replaced by the SARS-CoV 5’UTR failed, 
even when the MHV leader TRS was substituted for the SARS-CoV 
TRS (Kang et al., 2006a). The likely explanation for this failure is 
the different locations for the initiating AUGs of the two viruses. 
Thus the replacement of the MHV 5’UTR with the SARS-CoV 5’UTR 
would disrupt the long-range interaction in S5. Similarly, the BCoV 
32 nts region (nts 142-173) could not directly replace the corre- 
sponding 30 nts sequences in MHV-A59 without genetic adaptation 
(Guan et al., 2011), which is likely explained by partial disruption 
of base-pairing in S5, a long-range interaction between this region 
(nts 142-173) and sequences approximately 200 nts downstream 
within the nsp1 coding sequence in both MHV and BCoV. 

A new member of the betacoronaviruses, MERS-CoV, is a close 
relative of bat coronaviruses HKU4 and HKU5 (Drexler et al., 2014). 
Prediction with RNAstructure (Mathews et al., 2004) showed that 
the most stable secondary structures in the 5’ end 350 nts region for 
MERS-CoV and Bat CoV HKU5-1 are very similar and are also similar 
to those for MHV, BCoV and SARS-CoV, containing conserved stem- 
loops SL1, SL2, SL4 and SL5ABC which contains a four-helix junction 
in the models for all four viruses (Fig. 1). One difference in the MERS- 
CoV predicted structure is the presence of a short stem-loop located 
at positions 178-190 between SL4 and SL5ABC that is unlabeled in 
Fig. 1C. The functional significance of this stem-loop remains to 
be investigated. Both MERS-CoV and Bat CoV HKU5-1 have a single-stranded
region between SL2 and SL4. 


3.2.1. SL1 

The MHV 5’UTR SL1 has been shown to be functionally and structurally
bipartite by a detailed mutational and biophysical study 
(Li et al., 2008). Two pyrimidine-pyrimidine non-canonical base 
pairs divide the SL1 helical stem into upper and lower parts. The 
upper part of SL1 is required to be base-paired for efficient virus 


replication. Mutations that destroy base pairing of this region are 
lethal or generate viruses with impaired phenotypes. Compen- 
satory mutations introduced in both sides of the stem that restored 
the base pairing produce viable viruses with phenotypes similar 
to wild type virus. In contrast, the viruses containing mutations 
that destroy base pairing in the lower region of SL1 are viable, and 
the compensatory mutations predicted to restore base pairing are 
lethal, suggesting that the sequences rather than base pairing in 
the lower portion of the SL1 stem are required. Genomes carrying 
lethal mutations in SL1 fail to direct the synthesis of minus-sense 
subgenomic RNA implying that this element has a crucial role in 
this process, possibly related to template switching events during 
leader-body joining (Li et al., 2008). Deletion of a bulged A35 in 
the lower portion of the stem increases the thermal stability of 
the helix. Recovered viruses all contained destabilizing second-site 
mutations near the A35 deletion, suggesting that structural labil- 
ity of the lower region of SL1 is important for virus replication. 
The recovered viruses also contain additional mutations, A29G or 
A78G in the 3’UTR, providing genetic evidence supporting an
interaction between the 5’ and 3’UTRs, as hypothesized in one model
of coronavirus subgenomic mRNA synthesis (Zuniga et al., 2004). A 
dynamic SL1 model is proposed in which the base of SL1 has an opti- 
mized lability required to mediate a physical interaction between 
the 5'UTR and the 3’UTR that stimulates subgenomic RNA synthesis 
(Li et al., 2008). 


3.2.2. SL2 

SL2 is the most conserved secondary structure in the coronavirus
5’UTR: a 5 nt long stem capped with a (C/U)UUG(U/C) pentaloop,
which is the single most conserved sequence in the 5’UTR
(Liu et al., 2007). Replacing U48 with C or A is lethal, but the virus 
containing a U48G mutation is viable and grows with nearly a WT 
phenotype. Initial NMR studies of SL2 suggested that the imino
proton of U48 in the WT sequence, or of G48 in a U48G mutant, donates
a hydrogen bond that might stabilize a U-turn-like conformation,
consistent with the genetic results (Liu
et al., 2007). However, further NMR study showed that MHV SL2 
takes on a structure incompatible with a canonical U48-U49-G50 
U-turn loop structure, but better fits with an uYNMG(U)a-like or 
uCUYG(U)a-like tetraloop structure with U51 flipped out and G50 
stacked on A52 (Liu et al., 2009a). Furthermore, the structure of the 
SARS-CoV SL2 has been determined at atomic resolution showing 
that the SL2 pentaloop is stacked on a 5-bp stem and adopts a
canonical CUYG tetraloop fold with U51 flipped out of the stack (Lee et al.,
2011), making it likely that MHV SL2 has a similar structure. Fur- 
ther mutational studies of the loop demonstrate that any nucleotide 
replacement at C47, U49 and U51 can function and produce viable 
mutant viruses, whereas the G at position 50 is required (Liu et al., 
2007, 2009a). Mutational analysis of the SL2 stem demonstrates 
that the stem is required for viral viability whereas its nucleotide 
sequences are not important. RT-PCR analyses of the genomes con- 
taining lethal mutations in SL2 indicate that SL2 is required for 
subgenomic RNA synthesis (Liu et al., 2007). 


3.2.3. TRS 

For some coronaviruses, SARS-CoV and BCoV for example, the leader
TRS sequence is predicted to fold into a hairpin loop, designated
SL3. For the related betacoronavirus MHV,
a similar base pairing scheme for SL3 can be drawn (Chen and 
Olsthoorn, 2010), but the SL3 stem is not predicted to be stable 
at 37°C (Liu et al., 2007). Structural probing of MHV suggests that 
the TRS region is single stranded (Yang et al., 2015). The leader TRS 
has a key role in subgenomic mRNA synthesis. The discovery and 
cloning of DI RNAs enabled a series of experiments using in vitro 
transcribed DI RNAs containing a reporter gene under the control 
of either mutant or wild type TRS sequences to probe the sequence
requirements for leader-body joining during subgenomic RNA
synthesis (Hiscox et al., 1995; Makino et al., 1991; van der Most et al.,
1994). These experiments demonstrated that there is a require- 
ment for a minimum degree of sequence similarity between the 
TRS-L and TRS-B for transcription to proceed. However, the
relationship between the level of sequence similarity between TRS-L
and TRS-B and the transcriptional activity at a TRS was not entirely
straightforward, and thus additional factors are thought to play a
role. In two elegant mutational studies employing a TGEV reverse 
genetic system this question was re-investigated in the context of 
infectious virus (Sola et al., 2005; Zuniga et al., 2004). Extending the 
region of potential base pairing between TRS-L and the complement 
of TRS-B to include 4 nts of TRS 5’ and 3’ flanking sequence allowed 
Sola et al. to predict the ability of each TRS sequence to promote 
transcription based on the Gibbs free energy of the base pairing of 
this region (Sola et al., 2005). 
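
The idea of extending the TRS-L/cTRS-B duplex into the flanking sequence can be illustrated with a very crude proxy. The sketch below only counts Watson-Crick and G-U wobble pairs in a gapless alignment; it is not the free-energy calculation used by Sola et al. (2005), and the two example sequences (a 6 nt core with 4 nt flanks on each side) are hypothetical, chosen only to show how flanking positions change the count.

```python
# Crude proxy for TRS-L / cTRS-B duplex potential: count base pairs.
# This is NOT a Gibbs free energy calculation; it is an illustrative
# simplification with hypothetical input sequences.

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_pairs(trs_l, trs_b):
    """Count pairs between TRS-L and the complement of TRS-B.

    Both arguments are plus-strand RNA written 5'->3' and must have the
    same length (core plus flanks, gapless alignment).
    """
    if len(trs_l) != len(trs_b):
        raise ValueError("sequences must be the same length")
    # The nascent minus strand carries the complement of TRS-B; in the
    # antiparallel duplex, position i of TRS-L faces the complement of
    # position i of TRS-B.
    return sum((x, COMPLEMENT[y]) in PAIRS for x, y in zip(trs_l, trs_b))


# Hypothetical sequences: 4 nt 5' flank + 6 nt core + 4 nt 3' flank.
leader = "GAACCUAAACGAAA"
body = "UUACCUAAACUUGG"

print(count_pairs(leader[4:10], body[4:10]))  # core only
print(count_pairs(leader, body))              # core plus flanks
```

With these made-up inputs the core contributes six pairs and the flanks add two more, which is the kind of difference that a thermodynamic model would weight by nearest-neighbor energies rather than by a simple count.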


3.2.4. SL4 

Previously, the Brian lab showed a BCoV stem-loop they desig- 
nated as SLIII mapping at nts 97 through 116 in the BCoV 5’UTR, 
which must be base-paired for BCoV DI RNA replication (Raman 
et al., 2003). Later, Chen and Olsthoorn (2010)
employed a phylogenetic approach to predict the existence of SL4
downstream of the TRS-L, nts 80 through 130 in MHV, which differs
primarily from SL4 in the model predicted by the Leibowitz/Giedroc
group in that the proximal 6 nts on the left side (nts 74-79) and right
side (nts 139-134) of SL4 are base paired (Kang et al., 2006a; Liu
et al., 2007). For MHV, SL4 was predicted by the Leibowitz/Giedroc 
group (Kang et al., 2006a; Liu et al., 2007) to be positioned just 3’ to 
the leader TRS and is the first proposed structural RNA element of 
the 5’UTR 3’ of the leader (Fig. 1A). It is predicted to contain a
bipartite stem-loop, SL4a and SL4b, separated by a bulge (Kang et al.,
2006a; Liu et al., 2007). SL4b in this model corresponds to SLIII in
the BCoV 5’UTR (Raman et al., 2003).

Mutations that disrupt the helix in SLIII in the BCoV 5’UTR
(Raman et al., 2003) led to loss of RNA accumulation of mutant 
DI RNA, whereas compensatory mutations that restore the struc- 
ture result in some level of RNA accumulation of double mutant DI 
RNA progeny. The Brian group also tested the functional role in viral 
replication of a short AUG-initiated intra-5’UTR ORF (this is repre- 
sented by the heavy line in Fig. 1) that is present in BCoV, potentially 
encoding an eight-amino-acid peptide which is phylogenetically 
conserved, especially among betacoronaviruses. They reported that 
the amino acid sequences of the intra-5’UTR ORF are important 
for BCoV DI RNA accumulation and there is a positive correlation 
between the maintenance of the short ORF and maximal DI RNA 
accumulation (Raman et al., 2003). The MHV 5’UTR also contains 
an eight-amino-acid small ORF in SL4b, identical to that present in 
the corresponding region of BCoV (Raman et al., 2003). Yang et al. 
(2011) performed an extensive mutational analysis of MHV SL4, and 
as part of that study demonstrated that the eight amino acid small 
ORF did not have a critical role in MHV replication. This is contrary 
to the finding in BCoV DI RNA replication assays that these ele- 
ments are necessary to continually passage BCoV DI RNA (Raman 
et al., 2003; Raman and Brian, 2005), but is consistent with a later 
study by the Brian group which demonstrated that there is positive 
selection for the small upstream ORF, but it is not essential for MHV 
replication in cell culture (Wu et al., 2014). The likely explanation 
for the conflicting results from DI assays and assays done with sim- 
ilar mutants in the context of a complete viral genome is that by 
their very nature DI replication assays are competition assays with 
helper wild type virus and recombinant WT DI RNAs that arise dur- 
ing the experiment. Thus the DI experiments may have detected 
subtle decreases in relative fitness in the BCoV DI RNA replication 
that are not detected in straightforward viral replication assays that
focus on recovering viable viruses. The mutational study of SL4 by
Yang et al. (2011) also indicates that for SL4b, neither the structure
nor the sequence has a critical role in viral replication. The
parallel analysis of SL4a was consistent with the Chen phylogenetic
based model of SL4 (Chen and Olsthoorn, 2010) leaving nts 74-79 
and 131-139 unpaired. However, deletion of the entire MHV 5’UTR 
SL4 is lethal and the genome carrying this deletion is defective in 
directing subgenomic RNA synthesis. A viable mutant in which SL4 
was replaced with a sequence-unrelated stem-loop supports the
hypothesis that SL4 functions in part as a “spacer element” and this 
spacer function plays an important role in directing subgenomic 
RNA synthesis during virus replication (Yang et al., 2011). 


3.2.5. SL5 

The Brian group determined that a stem loop that they desig- 
nated SLIV (corresponding to SL5A in Fig. 1) spanning nts 171-225 
and thus extending into the nsp1 coding sequence was required 
for optimal replication of MHV (Guan et al., 2011). Nucleotides 
238-262 and nts 284-309 make up a segment of the bulged stem- 
loop designated SL5B in a SHAPE-generated model (Fig. 1), and 
correspond to an identical stem-loop structure that was predicted 
as the basal segment of SLV (nts 238-262 and nts 284-309) for 
betacoronavirus by Mfold and by covariation analysis (Brown et al., 
2007), and as part of an unnamed stem-loop by a structural- 
phylogenetic analysis of betacoronavirus (Chen and Olsthoorn, 
2010). The SHAPE-generated model differs from earlier models in 
that nts 234-237 are base-paired with nts 314-310 in the base 
of SL5B but these nucleotides are not included in SLV (Brown 
et al., 2007) or the corresponding region in the Chen model (Chen and
Olsthoorn, 2010). Moreover, the structure predicted by SHAPE anal- 
ysis for nts 263-283 in the terminal part of SL5B agrees with that 
predicted by the Chen model (Chen and Olsthoorn, 2010), but is 
different from the stem-loop predicted for the terminal portion 
of BCoV SLV by Brown et al. (2007). BCoV SLV (nts 239-310) is 
also supported by RNase structure probing (Brown et al., 2007). 
SL5C (nts 315-334) is similar to the distal part of SLVI predicted by 
Brown et al. (2007) and to the corresponding region in the Chen 
model (Chen and Olsthoorn, 2010). BCoV SLVI has been identified 
as a cis-acting element required for DI RNA replication (Brown et al., 
2007). In a recent study (Yang et al., 2015), disruption of SL5C by 
four nucleotide substitutions while maintaining the WT amino acid 
sequence resulted in viable recombinant viruses with only moder- 
ate impairment of virus replication compared to that of the WT 
virus, suggesting that the cis-acting SL5C is not required for viral 
replication. This contrasts with the data obtained with a BCoV DI RNA
model replicon, where the destruction of the distal helix that cor- 
responds to SL5C resulted in a failure of DI RNA replication (Brown 
et al., 2007). We have previously found similar discordant results 
with MHV DI RNAs and recombinant viruses containing identical 
mutations in their 3’UTRs (Johnson et al., 2005), and subsequently 
with recombinant MHVs containing SL4b mutations (Yang et al., 
2011) when compared with a BCoV DI RNA containing similar 
mutations (Raman et al., 2003). Currently we reason that the nature
of DI replication assays, in which a mutant DI RNA must compete
with helper wild type virus and recombinant WT DI RNAs that arise
during the experiment, allows DI replication experiments to magnify
modest decreases in fitness observed in straightforward viral
replication assays that focus on recovering viable viruses, making
them much more severe or lethal in DI RNA replication assays.


3.2.6. SL6-7 

SL6 spans nts 376-446 in MHV-A59 and encodes nsp1 amino 
acid positions 56-79. The SHAPE-generated SL6 (nts 376-446) 
(Fig. 1) is remarkably similar to SLVIII, predicted by Mfold for beta- 
coronavirus (Brown et al., 2007), but no structural or functional 
evidence supported the SLVIII prediction. A mutational study (Yang 
et al., 2015) demonstrates that MHV SL6 is not essential for viral
replication. In a previous study by Brockway and Denison (2005)
a genome (designated VUSB5) containing nsp1 charge-to-alanine
mutations in the base of the SL6 helix, nts 399-401, CGG → GCA
(R64A), and at nts 414-416, GAA → GCA (E69A), was not viable.
These mutations are predicted to destabilize the distal stem of SL6.
A second mutant, VUSB6, which includes substitutions at nts 441-446,
CGUGAU → GCAGCA (R78A, D79A), was also not viable. Because
the RNA secondary structure in these putative cis-acting regions 
was not known at the time, Brockway and Denison (2005) could 
not unequivocally assign the functional effects of these mutations 
on replication to the amino acid alterations in nsp1. Modeling the 
effects of these mutations on the secondary structure of SL6
indicates that VUSB5 would destabilize the distal portion of SL6 and
VUSB6 is predicted to completely open up the bottom of SL6. In 
light of the viability of the mutations that destabilize the distal 
region of SL6 (SL6-B) (Yang et al., 2015), and the ability of Brockway 
and Denison (2005) to recover viable viruses containing mutations 
(VUSB4) in nsp1 that are predicted to destabilize the base of SL6 
very much as did the lethal VUSB5, it is very likely that the lethality 
of VUSB5 and VUSB6 is due to their effects on nsp1 rather than
due to effects on RNA secondary structure. 
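
Because these substitutions change both the RNA and the protein, it helps to check that the quoted nucleotide coordinates and amino acid changes agree with each other. The sketch below is a minimal consistency check, not part of the cited studies: the codon table is a subset of the standard genetic code covering only the codons named above, and the start of the nsp1 reading frame (`ORF_START`) is an assumption inferred from the statement that nts 399-401 encode residue 64.

```python
# Subset of the standard genetic code: only the codons named in the text.
CODON_TO_AA = {"CGG": "R", "CGU": "R", "GAA": "E", "GAU": "D", "GCA": "A"}

# Assumed first nucleotide of the nsp1 ORF, inferred from "nts 399-401"
# encoding residue 64; this value is not quoted in the text itself.
ORF_START = 399 - 3 * (64 - 1)


def translate(rna):
    """Translate an in-frame RNA string using the small table above."""
    if len(rna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TO_AA[rna[i:i + 3]] for i in range(0, len(rna), 3))


def residue_number(nt, orf_start=ORF_START):
    """1-based residue number of the codon containing genome position nt."""
    return (nt - orf_start) // 3 + 1


# VUSB5: nts 399-401 and 414-416; VUSB6: nts 441-446.
for start, wt, mut in [(399, "CGG", "GCA"), (414, "GAA", "GCA"),
                       (441, "CGUGAU", "GCAGCA")]:
    print(start, residue_number(start), translate(wt), "->", translate(mut))

# SL6 is stated to span nts 376-446 and to encode residues 56-79.
print(residue_number(376), residue_number(446))
```

Under the inferred frame, the coordinates reproduce R64A, E69A, and R78A/D79A, and the SL6 boundaries map to residues 56 and 79, so the numbers quoted in this section are internally consistent.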

SL6 and SL7 in MHV-A59 diverge somewhat from the corre- 
sponding structures in BCoV-Mebus, and are quite different from 
SL6-8 in SARS-CoV and SL6 and SL7 in MERS-CoV (Fig. 1). This is 
consistent with functional studies of SL6 which demonstrate that 
SL6 is not essential for MHV replication (Yang et al., 2015), in
contrast to structural elements that are entirely within the 5’UTR (SL1,
SL2, SL4) or to the trifurcated SL5 stem-loop which extends from
the 5’UTR into the nsp1 coding sequence, mutations in which are lethal
or result in viruses that are crippled for viral replication (Guan et al., 2011,
2012; Li et al., 2008; Liu et al., 2007, 2009a; Yang et al., 2011). 


4. 3’-cis-Acting RNA elements in coronavirus replication 
and transcription 


The coronavirus 3’UTR consists of 300-500 nts plus a poly(A) 
tail, depending upon the particular coronavirus examined. Initial 
replication assays with MHV DI RNAs indicated that the minimal 
length of 3’ sequence required for MHV DI RNA replication was 436 
nts, including part of the N gene and the entire 301 nt 3’UTR (Lin
and Lai, 1993; Luytjes et al., 1996). The minimal sequences that
support TGEV and IBV DI RNA replication are 492 nts and 338 nts,
respectively; neither region includes any part of the N gene (Dalton
et al., 2001; Mendez et al., 1996) and the presence of an accessory 
gene 3’ of the N gene in the alphacoronaviruses suggested that 
at least for these coronavirus genera the 3’UTR contains all of the 
signals needed for replication. Subsequent experiments in which 
the MHV N gene was separated from the 3’UTR in recombinant 
MHVs suggested that the 3’UTR contained all of the cis-acting 
sequences needed for replication in betacoronaviruses as well (de 
Haan et al., 2002). In a DI replication assay the minimal cis-acting 
signal essential for negative-strand RNA synthesis was only the
3’-most 55 nts of the genome plus the poly(A) tail (Lin et al., 1994).
The poly(A) tail has been identified as an important cis-acting 
signal required for BCoV DI RNA replication, although as little as 
five As sufficed to initiate replication (Spagnolo and Hogue, 2000). 
The poly(A) tail has also been shown to be necessary for MHV 
minus-strand RNA synthesis (Lin et al., 1994). 

Fig. 2. A schematic drawing of the secondary structures of the 3’ untranslated
regions of MHV and MERS-CoV. For MERS-CoV, the dotted line in the pseudoknot
(PK) represents a potential non-canonical UC base pair.

A series of studies utilizing RNA folding algorithms, biochemical
studies, and functional studies have investigated the
structure and function of various cis-acting elements present in
the 3’UTR (Goebel et al., 2004a, 2007; Hsue et al., 2000; Hsue
and Masters, 1997; Liu et al., 2001, 2013; Stammler et al., 2011;
Williams et al., 1999; Zust et al., 2008) and the current best model
of the 3’UTR is shown in Fig. 2 (Zust et al., 2008). The 5’-most
secondary structure is a 68 nts bulged stem-loop just downstream
of the N gene stop codon, and it is essential for MHV DI RNA 
and viral replication (Hsue et al., 2000; Hsue and Masters, 1997). 
This bulged stem-loop is predicted to be conserved among the 
betacoronaviruses and the pairing, but not the primary sequence, 
of the four covariant base pairs, is critical for the function of the 
secondary structure (Goebel et al., 2004b; Hsue and Masters, 1997). 
3’ to the 68 nts bulged stem-loop is a hairpin stem-loop which can 
form a 54 nts hairpin-type pseudoknot, which is required for BCoV 
DI RNA replication (Williams et al., 1999). The pseudoknot is phy- 
logenetically conserved among coronaviruses, both in location and 
in shape but only partially in nucleotide sequence, indicating that 
it may function as a regulatory control element. Computer assisted 
inspection of the MERS-CoV sequence indicated it is present in this 
newly recognized betacoronavirus as well, although in this virus 
the pseudoknot may contain a non-canonical base pair (Fig. 2). 
Goebel et al. (2004a) demonstrated that in MHV and BCoV, the 
bulged stem-loop and pseudoknot are in part mutually exclusive 
structures because they partially overlap and cannot be formed 
simultaneously (see Fig. 2). The authors proposed that the bulged 
stem-loop and pseudoknot are the components of a molecular 
switch which has the potential to regulate a transition occurring 
during viral RNA synthesis, and supported this hypothesis by a 
series of reverse genetic experiments (Goebel et al., 2004a). 
Computer assisted modeling and biochemical probing of the 
RNA secondary structure of the last 166 nts of MHV downstream 
of the pseudoknot predicted a long multi-branch stem loop in this 
region of the genome (Liu et al., 2001). DI replication assays sug- 
gested that several of the stems in this region were functionally 
important. Paul Masters’ group subsequently employed a reverse 
genetic approach and showed that for MHV the long hypervariable 
bulged stem-loop structure between nts 46-156 is not essential
for viral replication, even though it contains an octanucleotide 
sequence, 5’-GGAAGAGC-3’, which is highly conserved in the 3’UTR 
of coronaviruses (Goebel et al., 2007). Based on a reverse genetic 
study combined with a phylogenetic comparison of the 3’UTRs of a 
number of coronaviruses Zust et al. (2008) developed an improved 
model for the 3’UTR which is shown in Fig. 2. This model contains 
a triple helix junction in which stems designated S3 and S4 flank 
the hairpin stem-loop (S2-L2 in Fig. 2) which participates in the 
pseudoknot. Stammler et al. (2011) performed a series of biophysical
studies demonstrating that the pseudoknotted conformation is
much less stable than the double-hairpin conformation, and suggested
that stacking of the pseudoknot with the S3 helix can stabilize the 
pseudoknotted conformation allowing it to form. Consistent with 
the biophysical studies, a reverse genetic study of this three helix 
junction region suggested that S3 is essential for viral replication 
(Liu et al., 2013). However, mutations disrupting the S4 helix of the 
triple helix junction, or deleting most of the L3 loop are tolerated. 

For the alphacoronaviruses, although the pseudoknot is con- 
served the bulged stem loop (BSL in Fig. 2) that is 5’ to the 
pseudoknot is absent (Dye and Siddell, 2005). In the gammacoro- 
naviruses a stem-loop located at the upstream end of the 3’UTR is 
required for viral replication (Dalton et al., 2001). Although a nearby 
pseudoknot is present in gammacoronavirus, its functional impor- 
tance has not been established (Williams et al., 1999). Only in the 
betacoronaviruses are both the pseudoknot and the bulged stem- 
loop closely overlapped. Although the primary sequences diverge 
among the betacoronaviruses, the secondary structures are highly 
conserved and functionally equivalent (Goebel et al., 2004b; Hsue 
and Masters, 1997; Wu et al., 2003). MHV and BCoV 3’UTRs are 
interchangeable although the nucleotide sequences diverge by 31% 
(Hsue and Masters, 1997), and the SARS-CoV 3’UTR can replace its 
MHV counterpart without affecting viral viability (Goebel et al., 
2004b; Kang et al., 2006b). However, viable chimeras could
not be recovered when the MHV 3’UTR was replaced with either 
the TGEV 3’UTR or the IBV 3’UTR (Goebel et al., 2004b; Hsue and 
Masters, 1997; Kang et al., 2006b). 


5. Viral and cellular proteins binding to the 5’ and/or 3’ 
cis-acting RNA elements 


In the negative-strand discontinuous RNA synthesis model pro- 
posed by Zuniga et al. (2004), viral and/or cellular factors binding 
to cis-acting RNA elements in the genomic RNA 5’UTR, TRS-L
and 3’UTR, might circularize the genome through RNA-RNA, or 
RNA-protein and protein-protein interactions, and thus produce 
a topology enabling base pairing between TRS-L and the nascent 
complementary TRS-Bs during synthesis of minus strand RNA. A 
fair number of host proteins have been reported to interact with 
these cis-acting signals and these are reviewed below. It should be
noted that the majority of this work has been performed with MHV,
and although functionally important host proteins
and the elements that they recognize are likely to be conserved,
it is possible that there might be some differences among the four
coronavirus genera.

Two viral proteins have been shown to bind to the coronavi- 
rus 5'UTR, the N protein and nsp1 (Table 1). The MHV N protein 
binds with high affinity and specificity to the TRS-L and possesses 
helix unwinding properties that suggest a role in template switch- 
ing (Baric et al., 1988; Grossoehme et al., 2009; Keane et al., 2012; 
Nelson et al., 2000), consistent with the demonstration that the 
TGEV and SARS N proteins have RNA chaperone activity (Zuniga 
et al., 2007). It has also been suggested that N protein binding to 
TRS-L favors translation of viral RNAs (Tahara et al., 1998). The 
N binding site in the leader RNA sequence in MHV is specifically
localized to nts 56-72 (UAAAUCUAAUCUAAACU), and this
interaction plays an important role in the discontinuous RNA transcription
unique to Nidoviruses (Baric et al., 1988; Grossoehme et al., 2009; 
Keane et al., 2012; Nelson et al., 2000; Stohlman et al., 1988). N pro- 
tein contains two structurally independent RNA-binding domains: 
an N-terminal RNA binding domain (NTD) and a C-terminal dimer- 
ization domain (CTD) (Hurst et al., 2010) linked by a Ser/Arg 
(SR)-rich linker. The N-terminal domain (NTD) of the CoV N protein 
functions as an RNA chaperone in vitro (Zuniga et al., 2007), binds to 
the TRS, and the NTD-TRS interaction is critical for efficient sgRNA 
synthesis in MHV (Grossoehme et al., 2009; Keane et al., 2012). The 
MHV NTD forms a high affinity (Kobs ~ 8 × 10⁷ M⁻¹) 1:1 complex
with a TRS-containing RNA (5’-gAAUCUAAAC) and its complement 
(cTRS) (Grossoehme et al., 2009). A recent study showed that the 
NTD-TRS interaction involves N residues R125, Y127, and Y190 and 
anchors the adenosine-rich region in the 3’ end of the TRS RNA to 
the β-platform of N and that this interaction is critical for efficient
sgRNA synthesis (Keane et al., 2012). This same study also showed
that the IBV and SARS-CoV N protein NTDs show limited binding
specificity for their cognate TRS sequences (Keane et al., 2012). Thus 
it is not clear that the specific binding of N protein to the TRS over 
and above its general RNA binding activity plays a role in sgRNA 
synthesis for all coronaviruses. The second viral protein that has 
been shown to bind to the 5’UTR is nsp1 (Gustin et al., 2009). The
BCoV nsp1 protein has been determined to bind to three cis-acting
stem loops in the 5’UTR, including SLIII, which corresponds to SL4b
in our model, and to regulate viral RNA translation and replication
(Gustin et al., 2009). It is likely that the closely related MHV nsp1 
protein has a similar function. 
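
The coordinates quoted above for the N binding site can be checked directly against the sequence given in the text. The sketch below is only an illustration: it assumes that the quoted 17 nt sequence begins exactly at nt 56 of the MHV genome, and it uses the UCUAA pentanucleotide as the motif to locate (the function names are hypothetical).

```python
# Sequence and start coordinate as quoted in the text (MHV leader).
N_SITE_START = 56
N_SITE = "UAAAUCUAAUCUAAACU"
REPEAT = "UCUAA"


def site_end(start, seq):
    """1-based coordinate of the last nucleotide of seq."""
    return start + len(seq) - 1


def find_all(seq, motif, offset):
    """1-based genome coordinates of every motif start, overlaps allowed.

    offset is the genome coordinate of seq[0].
    """
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(offset + i)
        i = seq.find(motif, i + 1)
    return hits


print(site_end(N_SITE_START, N_SITE))           # 72
print(find_all(N_SITE, REPEAT, N_SITE_START))   # [60, 65]
```

Under that assumption the sequence ends at nt 72, as stated, and contains two tandem copies of UCUAA starting at nts 60 and 65.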

A number of host proteins have been shown to bind to the 5’UTR 
as well (Table 1). The polypyrimidine tract-binding protein (PTB) has
been reported to bind to the pentanucleotide repeat UCUAA in the
positive-strand MHV RNA leader TRS and a role in subgenomic
mRNA synthesis was suggested based on a correlation between
binding efficiency to different DI RNA constructs and the
expression of a reporter under the control of a body TRS (Choi et al.,
2002; Li et al., 1999). For TGEV, PTB was also identified as
binding to the genomic 5’UTR using an RNA affinity-mass spectroscopy
approach (Galan et al., 2009). Heterogeneous nuclear ribonucleo- 
protein A1 (hnRNP A1) binds to the complement of the 5’ leader 
sequence (negative-strand leader) and to the complement of the 
TRS-B sequences (Choi et al., 2002; Li et al., 1997, 1999). The
functional importance of this binding has been controversial. Shen and
Masters (2001) tested the role of hnRNP A1 in MHV replication 
by investigating the ability of MHV to replicate in cells lacking a 
functional hnRNP A1. The infected cells supported viral replication 
and synthesized normal levels of genome and subgenomic RNAs, 
suggesting that hnRNP A1 is not required for MHV discontinuous 
transcription or genome replication. However, it has been shown 
that multiple other hnRNPs, including hnRNP A2/B1, hnRNP
A/B, and hnRNP A3, bind to the negative strand complement of the 
MHV leader TRS (Shi et al., 2003) and that overexpression of hnRNP 
A/B resulted in a 4-5 fold enhancement of viral RNA synthesis, sug- 
gesting that these proteins might also facilitate RNA synthesis and 
be able to substitute for hnRNP A1. Another member of the hnRNP 
family, synaptotagmin-binding cytoplasmic RNA-interacting pro- 
tein (SYNCRIP), similarly binds to the MHV 5’UTR and to its negative 
strand complement (Choi et al., 2004). The L-TRS sequence has 
been shown to be necessary but sufficient for SYNCRIP binding and 
siRNA knockdown of SYNCRIP delayed the time of peak viral RNA 
synthesis in MHV infected cells (Choi et al., 2004). 

Table 1
Host and viral proteins that bind to 5’ and 3’ cis-acting regions.

Proteins bound | Viruses | Binding locations in virus genome 5’ end | Binding locations in virus genome 3’ end
Viral proteins
N | MHV | 5’UTR TRS-L in (+)RNA, specifically to nts 56-72 (UAAAUCUAAUCUAAACU) | -
N | TGEV, SARS-CoV | Specificity not determined, functions as RNA chaperone | -
Nsp1 | BCoV | 5’UTR in (+)RNA, specifically to SLIII and its flanking sequences | -
Nsps 8 and 9 | MHV | - | Interaction with the 3’UTR in (+)RNA established genetically, binds specifically to the double hairpin adjacent to S3
Host proteins
PTB | MHV | 5’UTR TRS-L in (+)RNA, specifically to pentanucleotide repeat UCUAA | 3’UTR in (-)RNA, specifically to nts 53-149*
PTB | TGEV | 5’UTR in (+)RNA | -
hnRNP A1 | MHV | 5’UTR TRS-L and TRS-B in (-)RNA | 3’UTR in (+)RNA
SYNCRIP | MHV | 5’UTR in both strands | -
hnRNPs (A2/B1, A/B, A3) | MHV | TRS-L in (-)RNA | -
hnRNPs (A1, A0, A2/B1, Q, U); PABP; p100 transcriptional co-activator protein; arginyl-tRNA synthetase; EPRS | TGEV | - | 3’ end of genome in (+)RNA
Mitochondrial aconitase, HSP70, HSP60, HSP40 | MHV | - | 3’-most 42 nts in (+)RNA, specifically to 11 nts UGAAUGAAGUU at position 26-36* and to 38 nts at 129-166*

Similarly to the 5’UTR, multiple proteins have been demonstrated
to bind to the 3’UTR. A reverse genetic study by Zust et al.
(2008) revealed that a 6 nt insertion between the 3’UTR pseudoknot
and the S3 helix (see Fig. 2) crippled MHV replication but they
identified second site mutations in nsp8 and nsp9 which restored viral
replication to near normal levels. Based on this genetic result they
suggested a model in which a complex of the nsp8 primase plus 
the associated proteins nsps 7, 9, and 10 binds to the double hair- 
pin conformation of the 3’UTR just adjacent to S3 to initiate minus 
strand RNA synthesis. The newly synthesized minus strand
RNA would displace the 3’ half of the S3 helix and permit the for- 
mation of the RNA pseudoknot, also allowing binding of the main 
replicase complex containing nsp12, the main coronavirus RdRP, 
nsp13 (helicase), plus the associated RNA modification enzymes 
nsp14-16 (Bhardwaj et al., 2004; Decroly et al., 2008; Ivanov and
Ziebuhr, 2004; Minskaia et al., 2006; Snijder et al., 2003). 

A number of host proteins have been reported to interact 
with the 3’UTR. Yu and Leibowitz reported four host proteins
binding to MHV 3’-most 42 nts RNA probes using RNase
protection/gel mobility shift and UV cross-linking assays; a
conserved 11 nts UGAAUGAAGUU sequence spanning positions 26-36
in the 3’UTR (note that in this numbering system position 1 is
the first nt upstream of the poly(A) tail) was necessary for
protein binding activity and for efficient DI RNA replication (Yu and
Leibowitz, 1995a,b). A second protein binding region was similarly 
mapped within a 38 nucleotide (nt) sequence 166-129 nucleotides 
upstream of the 3’ end of the MHV genome and was also found 
to be necessary for efficient DI RNA replication (Liu et al., 1997). 
Subsequent studies determined that the proteins binding to the 
MHV 3’-most 42 nts element include mitochondrial aconitase and 
the chaperones mitochondrial HSP70, HSP60 and HSP40 (Nanda 
et al., 2004; Nanda and Leibowitz, 2001). PTB has been shown to 
bind to a negative-strand RNA complementary to the MHV 3’UTR 
at position 53-149 and less strongly at positions 270-307 (Huang 
and Lai, 1999). Deletions in the 53-149 binding site that abolished 
PTB binding also strongly inhibited subgenomic mRNA synthesis 
in an MHV DI construct containing a reporter gene under the con- 
trol of a TRS sequence (Huang and Lai, 1999). In addition to binding 
to the 5’UTR, hnRNP A1 has two binding sites in the MHV 3’UTR 
and these binding sites are complementary to the PTB binding 
sites in the negative sense 3’UTR enumerated above (Huang and 
Lai, 2001). DI RNAs containing a mutated hnRNP A1-binding site 
had reduced RNA transcription and replication activities. Using an 
RNA affinity-mass spectroscopy approach Galan et al. (2009)
identified nine host proteins, including several hnRNPs (A1, A0, A2/B1, Q,
and U), the glutamyl-prolyl-tRNA synthetase (EPRS), arginyl-tRNA
synthetase, poly(A) binding protein (PABP), and the p100
transcriptional co-activator that bound to the TGEV 3’UTR. A possible role
for these proteins in TGEV replication was suggested by the siRNA- 
mediated knockdown of PABP, hnRNP Q and EPRS expression with 
a concomitant 2-3-fold decrease in TGEV RNA synthesis and viral 
titer in infected cells. Spagnolo and Hogue (2000) demonstrated
that an interaction of PABP with the poly(A) tail is required for BCoV
and MHV DI RNA replication, and that a tail length of 5 is sufficient to
support DI replication.
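
Several of the binding sites in this section are numbered from the 3’ end, with position 1 being the first nt upstream of the poly(A) tail. The sketch below shows one way to convert such positions to conventional 5’-to-3’ numbering within the 3’UTR and to check the quoted element lengths. It assumes the 301 nt length given earlier for the MHV 3’UTR and that this length excludes the poly(A) tail; the function names are hypothetical.

```python
# MHV 3'UTR length as stated in the text; assumed to exclude the poly(A) tail.
UTR_LENGTH = 301


def from_3prime(pos_from_3p, utr_length=UTR_LENGTH):
    """Convert a position counted from the 3' end to 5'->3' numbering.

    Position 1 is the last nucleotide before the poly(A) tail; the
    result is 1-based from the first nucleotide of the 3'UTR.
    """
    if not 1 <= pos_from_3p <= utr_length:
        raise ValueError("position outside the 3'UTR")
    return utr_length - pos_from_3p + 1


def span_length(a, b):
    """Number of nucleotides in an inclusive range, in either order."""
    return abs(a - b) + 1


# The 11 nt UGAAUGAAGUU element at positions 26-36 from the 3' end.
print(span_length(26, 36), len("UGAAUGAAGUU"))
print(from_3prime(36), from_3prime(26))

# The 38 nt element 166-129 nts upstream of the 3' end.
print(span_length(166, 129))
print(from_3prime(166), from_3prime(129))
```

Under these assumptions the 11 nt element corresponds to nts 266-276 of the 3’UTR counted from its 5’ end, and the 38 nt element to nts 136-173; both quoted lengths match their coordinate ranges.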


6. Interactions between 5’ and 3’ ends and TRS 


The widely accepted model of coronavirus discontinuous
transcription of subgenomic RNA postulates that leader-body joining
occurs during the synthesis of minus strand RNAs (Baric and Yount, 
2000; Sawicki and Sawicki, 1990; Sethna et al., 1989). Zuniga et al. 
(2004) have provided strong support for this model by showing 
that in TGEV the body TRS sequences upstream of each gene sig- 
nal template switching to the TRS 5’ leader (TRS-L) and proposed a 
refinement of the model in which the 3’ and 5’UTRs interact through 
RNA-RNA and/or RNA-protein plus protein-protein interactions 
to promote circularization of the coronavirus genome to place the
elongating minus strand in a favorable topology for leader-body 
joining. 

Several candidate RNA-protein interactions have been
identified and proposed to contribute to circularization of the genome.
Spagnolo and Hogue (2000) suggested that the interaction of PABP 
bound to the coronavirus 3’ poly(A) tail might result in the circu- 
larization of the coronavirus genome through its ability to interact 
with eIF-4G, a component of the three-subunit eIF-4F cap bind- 
ing protein that binds to MRNA cap structures during translation 
(Sonenberg, 1996; Sonenberg et al., 1978). An interaction between 
PTB bound to leader (and body) TRS has also been postulated to 
play a role in coronavirus transcription by mediating an interaction 
between the TRS and the 3’UTR by binding to hnRNP A 1 bound to 
its protein binding sites in the 3’UTR (Lai, 1997, 1998). Although 
this is an attractive model, a mutant carrying a deletion encompassing
the high affinity hnRNP A1 binding site in the 3’UTR is able to
replicate and direct normal synthesis of subgenomic mRNAs, which makes
the PTB-hnRNP A1 association less likely to have a crucial role in
leader-body joining (Goebel et al., 2007). It remains possible,
however, that other members of the hnRNP family could substitute for
hnRNP A1 (see Section 5 for a discussion of this).
Li et al. (2008) reported that viruses recovered after deletion of a
bulged A35 in the lower portion of the 5’UTR SL1 stem contained
additional second-site mutations, A29G or A78G in the 3’UTR, providing
genetic evidence in support of an interaction between the 5’ and 
3’UTRs. They proposed a dynamic SL1 model in which the base of 
SL1 has an optimized lability required to mediate a physical interac- 
tion between the 5’UTR and the 3’UTR that stimulates subgenomic 
RNA synthesis. In unpublished work, P. Liu and Leibowitz identified a
potential base pairing between nucleotides 8-24 in the MHV 5’UTR SL1
and two discontinuous sequences in the 3’UTR, nts 1-6 and 218-228
(note that the 3’UTR sequences are numbered with position 1
corresponding to the first nucleotide 5’ of the poly(A) tail). An
extensive mutational analysis of these sequences failed to provide
genetic support for a functional role for this potential 5’-3’UTR
interaction in MHV replication. Thus the precise mechanism by which
the 5’ and 3’UTRs associate during viral replication remains to be
functionally defined.
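
The 3’-end numbering convention noted above can be made concrete with
a short sketch. It assumes that positions increase toward the 5’ end
of the 3’UTR, so that position 1 is the nucleotide immediately
adjacent to the poly(A) tail. The function names and the toy sequence
are hypothetical and are not taken from the MHV genome.

```python
def to_five_prime_index(pos: int, utr_length: int) -> int:
    """Map a 3'-end-numbered position (1 = nucleotide adjacent to the
    poly(A) tail) to a 0-based index counted from the 5' end of the 3'UTR.

    Assumption: numbering increases toward the 5' end of the 3'UTR.
    """
    if not 1 <= pos <= utr_length:
        raise ValueError(f"position {pos} is outside a 3'UTR of length {utr_length}")
    return utr_length - pos


def segment(utr: str, start: int, end: int) -> str:
    """Return nucleotides start..end (3'-end numbering, inclusive),
    written in the usual 5'->3' orientation.

    `utr` is the 3'UTR sequence without the poly(A) tail.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    first = to_five_prime_index(end, len(utr))   # 5'-most nucleotide of the segment
    last = to_five_prime_index(start, len(utr))  # 3'-most nucleotide of the segment
    return utr[first:last + 1]


# Toy 10-nt sequence (hypothetical, for illustration only).
toy_utr = "ACGUACGUAC"
print(segment(toy_utr, 1, 3))   # the three nucleotides next to the poly(A) tail
print(segment(toy_utr, 8, 10))  # the three 5'-most nucleotides
```

Under this convention, nts 1-6 and 218-228 would correspond to
segment(utr, 1, 6) and segment(utr, 218, 228) for a 3’UTR sequence
supplied without its poly(A) tail.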


7. Conclusion and future directions 


A series of studies utilizing consensus covariation modeling, chemical
probing, nuclear magnetic resonance (NMR) spectroscopy, and SHAPE
analysis have identified and characterized individual RNA secondary
structures, or small numbers of them, in the cis-acting regions of
MHV: the 5’UTR extending into the N-terminal nsp1 coding sequence, the
TRS, the 3’UTR and the poly(A) tail. Many of these studies have also
examined the functional roles of these structures in viral replication
(Chen and Olsthoorn, 2010; Guan et al., 2011, 2012; Kang et al.,
2006a; Li et al., 2008; Liu et al., 2007, 2009a; Yang et al., 2011).
A detailed understanding of the RNA structures within the cis-acting
sequences in the 5’UTR and 3’UTR, and eventually the entire MHV
genome, and of the RNA-RNA and RNA-protein interactions that direct
viral RNA synthesis and virus replication will advance our
understanding of these processes. Understanding these interactions in
highly pathogenic coronaviruses, such as SARS-CoV and MERS-CoV, may
enable the design of small molecule inhibitors of these replicative
processes.